Coping With Ambiguity in a Large-Scale Machine Translation System
Not recorded
Abstract
In an interlingual knowledge-based machine translation system, ambiguity arises when the source language analyzer produces more than one interlingua expression for a source sentence. This can have a negative impact on translation quality, since a target sentence may be produced from an unintended meaning. In this paper we describe the methods used in the KANT machine translation system to reduce or eliminate ambiguity in a large-scale application domain. We also test these methods on a large corpus of test sentences, in order to illustrate how the different disambiguation methods reduce the average number of parses per sentence.

1 Introduction

The KANT system [Mitamura et al., 1991] is a system for Knowledge-based, Accurate Natural-language Translation. The system is used in focused technical domains for multilingual translation of controlled source language documents. KANT is an interlingua-based system: the source language analyzer produces an interlingua expression for each source sentence, and this interlingua is processed to produce the corresponding target sentence.

The problem of ambiguity arises when the system produces more than one interlingua representation for a single input sentence. If the goal is to automate translation and produce output that does not require post-editing, then the presence of ambiguity has a negative impact on translation quality, since a target sentence may be produced from an unintended meaning. When it is possible to limit the interpretations of a sentence to just those that are coherent in the translation domain, then the accuracy of the MT system is enhanced.

Ambiguity can occur at different levels of processing in source analysis. In this paper, we describe how we cope with ambiguity in the KANT controlled lexicon, grammar, and semantic domain model, and how these are designed to reduce or eliminate ambiguity in a given translation domain.
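
As a rough illustration of the measure mentioned above, the average number of parses per sentence, the following Python sketch counts candidate readings before and after restricting them to those assumed to be coherent in the domain. It is not the KANT implementation: the sentence identifiers, the reading labels, the `domain_coherent` set, and both function names are hypothetical and exist only to make the idea concrete.

```python
from statistics import mean

# Hypothetical data: each sentence maps to its candidate interlingua readings.
# The readings and the domain set below are invented for illustration only.
candidates = {
    "s1": ["reading-a", "reading-b", "reading-c"],
    "s2": ["reading-d"],
    "s3": ["reading-e", "reading-f"],
}

# Readings assumed to be coherent in the translation domain (hypothetical).
domain_coherent = {"reading-a", "reading-d", "reading-e", "reading-f"}


def average_parses(parses_by_sentence):
    """Average number of candidate readings per sentence."""
    return mean(len(readings) for readings in parses_by_sentence.values())


def restrict_to_domain(parses_by_sentence, coherent):
    """Keep only readings that the (assumed) domain filter accepts."""
    return {
        sentence: [r for r in readings if r in coherent]
        for sentence, readings in parses_by_sentence.items()
    }


before = average_parses(candidates)
after = average_parses(restrict_to_domain(candidates, domain_coherent))
print(f"average parses per sentence: {before:.2f} -> {after:.2f}")
```

A sentence counts as ambiguous in this sketch whenever its list holds more than one reading, so a lower average after filtering corresponds to fewer sentences at risk of being translated from an unintended meaning.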

2 Constraining the Source Text

The KANT domain lexicon and grammar are a constrained subset of the general source language lexicon and grammar. The strategy of constraining the source text has three main ...
Similar resources
Coping With Ambiguity in a Large-Scale Machine Translation System
In an interlingual knowledge-based machine translation system, ambiguity arises when the source language analyzer produces more than one interlingua expression for a source sentence. This can have a negative impact on translation quality, since a target sentence may be produced from an unintended meaning. In this paper we describe the methods used in the KANT machine translation system to reduc...
Coping with Ambiguity in Knowledge-based Natural Language Analysis
This paper describes the strategies and techniques used by the English analysis component of the KANT Knowledge-based Machine Translation system to cope with ambiguity. The constraints for elimination of ambiguity are distributed across the various knowledge sources in the analyzer. As a result, efficiency in the analysis component is maintained, and output quality is improved.
Psychological coping styles and tendency to addiction in teenagers: the mediating role of resilience and ambiguity.
The purpose of this study was to investigate the relationship between psychological coping styles and tendency to addiction in teenagers, with resilience and ambiguity as mediators. The design of the present study is correlational. For this purpose, 200 students were selected from the statistical population of high school students by multi-stage cluster random sampling, and then the Endler and Parker...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
Resolving Structural Transfer Ambiguity in Chinese-to-Korean Machine Translation
We propose a new Chinese-to-Korean machine translation system called TOTAL-CK, a prototype transfer-based MT system designed for the large-scale practical domain. TOTAL-CK consists of the components of analysis, transfer, and generation. In this paper, we mainly discuss the transfer issues resulting from stylistic structural differences between Chinese and Korean. The dependency gramm...